Overview

Dataset statistics

Number of variables: 15
Number of observations: 21208
Missing cells: 10876
Missing cells (%): 3.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 80.9 B
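
The missing-cell percentage follows directly from the counts in this table: the dataset has 15 × 21208 cells in total, of which 10876 are missing. A minimal Python check of that arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table.
n_variables = 15
n_observations = 21208
missing_cells = 10876

total_cells = n_variables * n_observations       # every cell in the table
missing_pct = 100 * missing_cells / total_cells  # share of cells that are missing

print(total_cells)            # 318120
print(f"{missing_pct:.1f}%")  # 3.4%
```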

Variable types

Categorical: 13
Unsupported: 1
Numeric: 1

Alerts

image has constant value "image.png" (Constant)
question has a high cardinality: 9122 distinct values (High cardinality)
hint has a high cardinality: 4652 distinct values (High cardinality)
category has a high cardinality: 127 distinct values (High cardinality)
skill has a high cardinality: 379 distinct values (High cardinality)
lecture has a high cardinality: 262 distinct values (High cardinality)
solution has a high cardinality: 12952 distinct values (High cardinality)
pid is highly overall correlated with answer and 5 other fields (High correlation)
answer is highly overall correlated with pid (High correlation)
task is highly overall correlated with pid (High correlation)
grade is highly overall correlated with pid (High correlation)
subject is highly overall correlated with pid and 1 other fields (High correlation)
topic is highly overall correlated with pid and 1 other fields (High correlation)
split is highly overall correlated with pid (High correlation)
hint is highly imbalanced (52.2%) (Imbalance)
task is highly imbalanced (83.4%) (Imbalance)
image has 10876 (51.3%) missing values (Missing)
pid is uniformly distributed (Uniform)
pid has unique values (Unique)
choices is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
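
Several of these alerts (Constant, Missing, High cardinality, Unique) can be approximated with simple per-column counts. The sketch below is not pandas-profiling's own logic: the function name `simple_alerts`, the toy columns, and the cardinality threshold of 50 are all assumptions made for illustration, and the real tool applies High cardinality only to categorical columns.

```python
from collections import Counter

def simple_alerts(values, cardinality_threshold=50):
    """Return alert names for one column; None marks a missing cell.

    Hypothetical helper: the rules and the threshold are assumptions.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    alerts = []
    if len(present) < len(values):
        alerts.append("Missing")           # at least one missing cell
    if len(counts) == 1:
        alerts.append("Constant")          # a single distinct non-missing value
    if len(counts) > cardinality_threshold:
        alerts.append("High cardinality")  # many distinct values
    if present and len(counts) == len(present):
        alerts.append("Unique")            # no value is repeated
    return alerts

# Toy columns shaped like `image` and `pid` in this report.
image = ["image.png", None, "image.png", None]
pid = list(range(10))

print(simple_alerts(image))  # ['Missing', 'Constant']
print(simple_alerts(pid))    # ['Unique']
```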

Reproduction

Analysis started: 2023-09-17 07:41:59.147843
Analysis finished: 2023-09-17 07:46:37.418922
Duration: 4 minutes and 38.27 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

question
Categorical

Distinct: 9122
Distinct (%): 43.0%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
Think about the magnetic force between the magnets in each pair. Which of the following statements is true?
 
599
Which country is highlighted?
 
468
What is the name of the colony shown?
 
398
Will these magnets attract or repel each other?
 
317
Compare the average kinetic energies of the particles in each sample. Which sample has the higher temperature?
 
277
Other values (9117)
19149 

Length

Max length: 862
Median length: 605
Mean length: 69.660081
Min length: 14

Characters and Unicode

Total characters: 1477351
Distinct characters: 93
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
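
As a rough illustration of how such per-category counts can be obtained, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the first sample row of `question`; it is an approximation of the idea, not the profiler's implementation, and the `CATEGORY_NAMES` mapping covers only the codes that occur in this one string.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the sample text.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

text = "Which of these states is farthest north?"  # 1st sample row of `question`
counts = Counter(unicodedata.category(ch) for ch in text)

# Print each category with its count and share of all characters.
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n, f"{100 * n / len(text):.1f}%")
```

For this 40-character string the tally is 32 lowercase letters, 6 spaces, 1 uppercase letter and 1 punctuation mark.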

Unique

Unique: 7891
Unique (%): 37.2%
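
The report gives 9122 distinct values but only 7891 unique ones, which suggests that "distinct" counts the different values while "unique" counts the values that occur exactly once. A small sketch of that reading on a toy column follows; the toy data and this interpretation are assumptions, not taken from the pandas-profiling source.

```python
from collections import Counter

values = ["a", "b", "b", "c", "c", "c", "d"]  # hypothetical toy column
counts = Counter(values)

n_distinct = len(counts)                              # different values: a, b, c, d
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once: a, d

print(n_distinct, f"{100 * n_distinct / len(values):.1f}%")  # 4 57.1%
print(n_unique, f"{100 * n_unique / len(values):.1f}%")      # 2 28.6%
```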

Sample

1st row: Which of these states is farthest north?
2nd row: Identify the question that Tom and Justin's experiment can best answer.
3rd row: Identify the question that Kathleen and Bryant's experiment can best answer.
4th row: Which figure of speech is used in this text? Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. —Homer, The Iliad
5th row: Which of the following could Gordon's test show?

Common Values

ValueCountFrequency (%)
Think about the magnetic force between the magnets in each pair. Which of the following statements is true? 599
 
2.8%
Which country is highlighted? 468
 
2.2%
What is the name of the colony shown? 398
 
1.9%
Will these magnets attract or repel each other? 317
 
1.5%
Compare the average kinetic energies of the particles in each sample. Which sample has the higher temperature? 277
 
1.3%
Which of the following contains a vague pronoun reference? 261
 
1.2%
Which continent is highlighted? 228
 
1.1%
Which property do these three objects have in common? 194
 
0.9%
Which closing is correct for a letter? 157
 
0.7%
Which sentence states a fact? 151
 
0.7%
Other values (9112) 18158
85.6%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
the 22165
 
8.6%
is 11293
 
4.4%
which 10432
 
4.1%
of 9168
 
3.6%
a 7897
 
3.1%
what 5859
 
2.3%
in 5688
 
2.2%
this 3981
 
1.6%
to 2774
 
1.1%
following 2642
 
1.0%
Other values (9622) 174859
68.1%

Most occurring characters

ValueCountFrequency (%)
229978
15.6%
e 146361
 
9.9%
t 118329
 
8.0%
i 91665
 
6.2%
a 90727
 
6.1%
h 86948
 
5.9%
o 86730
 
5.9%
s 80556
 
5.5%
n 73438
 
5.0%
r 61321
 
4.2%
Other values (83) 411298
27.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1160372
78.5%
Space Separator 229978
 
15.6%
Uppercase Letter 39728
 
2.7%
Other Punctuation 37110
 
2.5%
Control 5577
 
0.4%
Dash Punctuation 1392
 
0.1%
Decimal Number 1223
 
0.1%
Close Punctuation 969
 
0.1%
Open Punctuation 969
 
0.1%
Other Symbol 17
 
< 0.1%
Other values (3) 16
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 146361
12.6%
t 118329
10.2%
i 91665
 
7.9%
a 90727
 
7.8%
h 86948
 
7.5%
o 86730
 
7.5%
s 80556
 
6.9%
n 73438
 
6.3%
r 61321
 
5.3%
c 48286
 
4.2%
Other values (24) 276011
23.8%
Uppercase Letter
ValueCountFrequency (%)
W 16364
41.2%
S 2968
 
7.5%
I 2700
 
6.8%
C 2418
 
6.1%
T 2088
 
5.3%
A 1493
 
3.8%
B 1408
 
3.5%
M 1392
 
3.5%
D 1306
 
3.3%
H 1113
 
2.8%
Other values (16) 6478
 
16.3%
Other Punctuation
ValueCountFrequency (%)
? 18734
50.5%
. 9766
26.3%
, 4349
 
11.7%
' 3159
 
8.5%
" 720
 
1.9%
! 220
 
0.6%
: 69
 
0.2%
; 56
 
0.2%
% 25
 
0.1%
& 12
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 383
31.3%
1 229
18.7%
2 135
 
11.0%
5 117
 
9.6%
9 82
 
6.7%
7 68
 
5.6%
8 57
 
4.7%
4 55
 
4.5%
3 54
 
4.4%
6 43
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 1219
87.6%
— 165
 
11.9%
– 8
 
0.6%
Close Punctuation
ValueCountFrequency (%)
) 967
99.8%
] 2
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 967
99.8%
[ 2
 
0.2%
Space Separator
ValueCountFrequency (%)
229978
100.0%
Control
ValueCountFrequency (%)
5577
100.0%
Other Symbol
ValueCountFrequency (%)
° 17
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 11
100.0%
Nonspacing Mark
ValueCountFrequency (%)
U+0329 (combining vertical line below) 4
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1200100
81.2%
Common 277247
 
18.8%
Inherited 4
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 146361
12.2%
t 118329
 
9.9%
i 91665
 
7.6%
a 90727
 
7.6%
h 86948
 
7.2%
o 86730
 
7.2%
s 80556
 
6.7%
n 73438
 
6.1%
r 61321
 
5.1%
c 48286
 
4.0%
Other values (50) 315739
26.3%
Common
ValueCountFrequency (%)
229978
83.0%
? 18734
 
6.8%
. 9766
 
3.5%
5577
 
2.0%
, 4349
 
1.6%
' 3159
 
1.1%
- 1219
 
0.4%
) 967
 
0.3%
( 967
 
0.3%
" 720
 
0.3%
Other values (22) 1811
 
0.7%
Inherited
ValueCountFrequency (%)
U+0329 (combining vertical line below) 4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1477132
> 99.9%
Punctuation 173
 
< 0.1%
None 42
 
< 0.1%
Diacriticals 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
229978
15.6%
e 146361
 
9.9%
t 118329
 
8.0%
i 91665
 
6.2%
a 90727
 
6.1%
h 86948
 
5.9%
o 86730
 
5.9%
s 80556
 
5.5%
n 73438
 
5.0%
r 61321
 
4.2%
Other values (71) 411079
27.8%
Punctuation
ValueCountFrequency (%)
— 165
95.4%
– 8
 
4.6%
None
Value  Count  Frequency (%)
°      17     40.5%
ł      6      14.3%
ż      6      14.3%
é      6      14.3%
ñ      3      7.1%
ø      1      2.4%
á      1      2.4%
è      1      2.4%
í      1      2.4%
Diacriticals
ValueCountFrequency (%)
U+0329 (combining vertical line below) 4
100.0%

choices
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB

answer
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
1
8542 
0
8399 
2
2961 
3
1275 
4
 
31

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21208
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

Most occurring characters

ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21208
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 21208
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 8542
40.3%
0 8399
39.6%
2 2961
 
14.0%
3 1275
 
6.0%
4 31
 
0.1%

hint
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 4652
Distinct (%): 21.9%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
10988 
Select the better estimate.
 
432
Select the best estimate.
 
413
The images below show two pairs of magnets. The magnets in different pairs do not affect each other. All the magnets shown are made of the same material.
 
323
Select the best answer.
 
291
Other values (4647)
8761 

Length

Max length: 1910
Median length: 0
Mean length: 97.676113
Min length: 0

Characters and Unicode

Total characters: 2071515
Distinct characters: 93
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3872
Unique (%): 18.3%

Sample

1st row: (empty)
2nd row: The passage below describes an experiment. Read the passage and then follow the instructions below. Tom placed a ping pong ball in a catapult, pulled the catapult's arm back to a 45° angle, and launched the ball. Then, Tom launched another ping pong ball, this time pulling the catapult's arm back to a 30° angle. With each launch, his friend Justin measured the distance between the catapult and the place where the ball hit the ground. Tom and Justin repeated the launches with ping pong balls in four more identical catapults. They compared the distances the balls traveled when launched from a 45° angle to the distances the balls traveled when launched from a 30° angle. Figure: a catapult for launching ping pong balls.
3rd row: The passage below describes an experiment. Read the passage and then follow the instructions below. Kathleen applied a thin layer of wax to the underside of her snowboard and rode the board straight down a hill. Then, she removed the wax and rode the snowboard straight down the hill again. She repeated the rides four more times, alternating whether she rode with a thin layer of wax on the board or not. Her friend Bryant timed each ride. Kathleen and Bryant calculated the average time it took to slide straight down the hill on the snowboard with wax compared to the average time on the snowboard without wax. Figure: snowboarding down a hill.
4th row: (empty)
5th row: People can use the engineering-design process to develop solutions to problems. One step in the process is testing if a potential solution meets the requirements of the design. The passage below describes how the engineering-design process was used to test a solution to a problem. Read the passage. Then answer the question below. Gordon was an aerospace engineer who was developing a parachute for a spacecraft that would land on Mars. He needed to add a vent at the center of the parachute so the spacecraft would land smoothly. However, the spacecraft would have to travel at a high speed before landing. If the vent was too big or too small, the parachute might swing wildly at this speed. The movement could damage the spacecraft. So, to help decide how big the vent should be, Gordon put a parachute with a 1 m vent in a wind tunnel. The wind tunnel made it seem like the parachute was moving at 200 km per hour. He observed the parachute to see how much it swung. Figure: a spacecraft's parachute in a wind tunnel.

Common Values

ValueCountFrequency (%)
10988
51.8%
Select the better estimate. 432
 
2.0%
Select the best estimate. 413
 
1.9%
The images below show two pairs of magnets. The magnets in different pairs do not affect each other. All the magnets shown are made of the same material. 323
 
1.5%
Select the best answer. 291
 
1.4%
The diagrams below show two pure samples of gas in identical closed, rigid containers. Each colored ball represents one gas particle. Both samples have the same number of particles. 277
 
1.3%
Select the better answer. 241
 
1.1%
The objects are identical except for their temperatures. 195
 
0.9%
Two magnets are placed as shown. 193
 
0.9%
Use the data to answer the question below. 178
 
0.8%
Other values (4642) 7677
36.2%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
the 29124
 
8.1%
a 13317
 
3.7%
of 11025
 
3.1%
in 7977
 
2.2%
to 7275
 
2.0%
is 6665
 
1.9%
and 6078
 
1.7%
for 4348
 
1.2%
below 3901
 
1.1%
two 3356
 
0.9%
Other values (8272) 265869
74.1%

Most occurring characters

ValueCountFrequency (%)
340133
16.4%
e 226275
 
10.9%
t 142377
 
6.9%
a 141631
 
6.8%
o 123073
 
5.9%
s 118012
 
5.7%
i 102594
 
5.0%
r 100821
 
4.9%
n 100234
 
4.8%
h 87668
 
4.2%
Other values (83) 588697
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1616982
78.1%
Space Separator 340133
 
16.4%
Other Punctuation 48830
 
2.4%
Uppercase Letter 47338
 
2.3%
Control 9948
 
0.5%
Decimal Number 4459
 
0.2%
Dash Punctuation 1867
 
0.1%
Close Punctuation 843
 
< 0.1%
Open Punctuation 843
 
< 0.1%
Other Symbol 244
 
< 0.1%
Other values (3) 28
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 226275
14.0%
t 142377
 
8.8%
a 141631
 
8.8%
o 123073
 
7.6%
s 118012
 
7.3%
i 102594
 
6.3%
r 100821
 
6.2%
n 100234
 
6.2%
h 87668
 
5.4%
l 72669
 
4.5%
Other values (22) 401628
24.8%
Uppercase Letter
ValueCountFrequency (%)
T 11560
24.4%
S 4798
10.1%
A 3671
 
7.8%
I 2875
 
6.1%
F 2518
 
5.3%
H 2443
 
5.2%
R 2241
 
4.7%
B 2084
 
4.4%
C 1961
 
4.1%
P 1824
 
3.9%
Other values (16) 11363
24.0%
Other Punctuation
ValueCountFrequency (%)
. 33586
68.8%
, 9726
 
19.9%
: 3370
 
6.9%
' 1572
 
3.2%
" 216
 
0.4%
! 139
 
0.3%
% 135
 
0.3%
/ 54
 
0.1%
? 23
 
< 0.1%
; 9
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 1546
34.7%
1 836
18.7%
2 502
 
11.3%
5 481
 
10.8%
3 260
 
5.8%
4 239
 
5.4%
6 198
 
4.4%
7 169
 
3.8%
8 132
 
3.0%
9 96
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
- 1852
99.2%
— 12
 
0.6%
‑ 2
 
0.1%
– 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 839
99.5%
] 4
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 839
99.5%
[ 4
 
0.5%
Math Symbol
ValueCountFrequency (%)
+ 2
50.0%
− 2
50.0%
Space Separator
ValueCountFrequency (%)
340133
100.0%
Control
ValueCountFrequency (%)
9948
100.0%
Other Symbol
ValueCountFrequency (%)
° 244
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 20
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1664320
80.3%
Common 407195
 
19.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 226275
13.6%
t 142377
 
8.6%
a 141631
 
8.5%
o 123073
 
7.4%
s 118012
 
7.1%
i 102594
 
6.2%
r 100821
 
6.1%
n 100234
 
6.0%
h 87668
 
5.3%
l 72669
 
4.4%
Other values (48) 448966
27.0%
Common
ValueCountFrequency (%)
340133
83.5%
. 33586
 
8.2%
9948
 
2.4%
, 9726
 
2.4%
: 3370
 
0.8%
- 1852
 
0.5%
' 1572
 
0.4%
0 1546
 
0.4%
) 839
 
0.2%
( 839
 
0.2%
Other values (25) 3784
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2071197
> 99.9%
None 297
 
< 0.1%
Punctuation 19
 
< 0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
340133
16.4%
e 226275
 
10.9%
t 142377
 
6.9%
a 141631
 
6.8%
o 123073
 
5.9%
s 118012
 
5.7%
i 102594
 
5.0%
r 100821
 
4.9%
n 100234
 
4.8%
h 87668
 
4.2%
Other values (71) 588379
28.4%
None
Value  Count  Frequency (%)
°      244    82.2%
á      18     6.1%
ł      12     4.0%
ż      12     4.0%
ñ      9      3.0%
ë      1      0.3%
é      1      0.3%
Punctuation
ValueCountFrequency (%)
— 12
63.2%
’ 4
 
21.1%
‑ 2
 
10.5%
– 1
 
5.3%
Math Operators
ValueCountFrequency (%)
− 2
100.0%

image
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 10876
Missing (%): 51.3%
Memory size: 165.8 KiB
image.png
10332 

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 92988
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: image.png
2nd row: image.png
3rd row: image.png
4th row: image.png
5th row: image.png

Common Values

ValueCountFrequency (%)
image.png 10332
48.7%
(Missing) 10876
51.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
image.png 10332
100.0%

Most occurring characters

ValueCountFrequency (%)
g 20664
22.2%
i 10332
11.1%
m 10332
11.1%
a 10332
11.1%
e 10332
11.1%
. 10332
11.1%
p 10332
11.1%
n 10332
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 82656
88.9%
Other Punctuation 10332
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 20664
25.0%
i 10332
12.5%
m 10332
12.5%
a 10332
12.5%
e 10332
12.5%
p 10332
12.5%
n 10332
12.5%
Other Punctuation
ValueCountFrequency (%)
. 10332
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 82656
88.9%
Common 10332
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 20664
25.0%
i 10332
12.5%
m 10332
12.5%
a 10332
12.5%
e 10332
12.5%
p 10332
12.5%
n 10332
12.5%
Common
ValueCountFrequency (%)
. 10332
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92988
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 20664
22.2%
i 10332
11.1%
m 10332
11.1%
a 10332
11.1%
e 10332
11.1%
. 10332
11.1%
p 10332
11.1%
n 10332
11.1%

task
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.0 KiB
closed choice
20404 
yes or no
 
600
true-or false
 
204

Length

Max length: 13
Median length: 13
Mean length: 12.886835
Min length: 9
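
These length statistics can be reproduced from the three `task` values and their counts as listed in this report. The sketch below uses only the Python standard library rather than pandas-profiling itself, and the variable names are illustrative.

```python
from statistics import median

# Value counts of `task` as listed in this report.
value_counts = {"closed choice": 20404, "yes or no": 600, "true-or false": 204}

# Expand to one string length per row (21208 small integers).
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

total_chars = sum(lengths)
print(max(lengths), median(lengths), min(lengths))
print(total_chars)                           # 273304
print(round(total_chars / len(lengths), 6))  # 12.886835
```

The total of 273304 characters also matches the "Total characters" figure reported for this variable.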

Characters and Unicode

Total characters: 273304
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: closed choice
2nd row: closed choice
3rd row: closed choice
4th row: closed choice
5th row: closed choice

Common Values

Value          Count  Frequency (%)
closed choice  20404  96.2%
yes or no      600    2.8%
true-or false  204    1.0%
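
The 83.4% imbalance figure reported for `task` is consistent with a score of one minus the normalized Shannon entropy of these value counts. The sketch below computes that score; the function name `imbalance_score` is illustrative, and both the formula and the convention of returning 0 for a single-class column are assumptions that happen to reproduce the reported number, not a quotation of the profiler's code.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 for a uniform column, near 1 when one value dominates."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

task_counts = [20404, 600, 204]  # closed choice, yes or no, true-or false
score = imbalance_score(task_counts)
print(f"{100 * score:.1f}%")  # 83.4%
```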

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

ValueCountFrequency (%)
closed 20404
47.4%
choice 20404
47.4%
yes 600
 
1.4%
or 600
 
1.4%
no 600
 
1.4%
true-or 204
 
0.5%
false 204
 
0.5%

Most occurring characters

ValueCountFrequency (%)
c 61212
22.4%
o 42212
15.4%
e 41816
15.3%
21808
 
8.0%
s 21208
 
7.8%
l 20608
 
7.5%
h 20404
 
7.5%
i 20404
 
7.5%
d 20404
 
7.5%
r 1008
 
0.4%
Other values (7) 2220
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 251292
91.9%
Space Separator 21808
 
8.0%
Dash Punctuation 204
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 61212
24.4%
o 42212
16.8%
e 41816
16.6%
s 21208
 
8.4%
l 20608
 
8.2%
h 20404
 
8.1%
i 20404
 
8.1%
d 20404
 
8.1%
r 1008
 
0.4%
y 600
 
0.2%
Other values (5) 1416
 
0.6%
Space Separator
ValueCountFrequency (%)
21808
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 204
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 251292
91.9%
Common 22012
 
8.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 61212
24.4%
o 42212
16.8%
e 41816
16.6%
s 21208
 
8.4%
l 20608
 
8.2%
h 20404
 
8.1%
i 20404
 
8.1%
d 20404
 
8.1%
r 1008
 
0.4%
y 600
 
0.2%
Other values (5) 1416
 
0.6%
Common
ValueCountFrequency (%)
21808
99.1%
- 204
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 273304
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 61212
22.4%
o 42212
15.4%
e 41816
15.3%
21808
 
8.0%
s 21208
 
7.8%
l 20608
 
7.5%
h 20404
 
7.5%
i 20404
 
7.5%
d 20404
 
7.5%
r 1008
 
0.4%
Other values (7) 2220
 
0.8%

grade
Categorical

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 KiB
grade4
3544 
grade5
3086 
grade3
3032 
grade7
2749 
grade8
2546 
Other values (7)
6251 

Length

Max length: 7
Median length: 6
Mean length: 6.0724727
Min length: 6

Characters and Unicode

Total characters: 128785
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: grade2
2nd row: grade8
3rd row: grade7
4th row: grade11
5th row: grade8

Common Values

ValueCountFrequency (%)
grade4 3544
16.7%
grade5 3086
14.6%
grade3 3032
14.3%
grade7 2749
13.0%
grade8 2546
12.0%
grade6 2450
11.6%
grade2 1678
7.9%
grade10 558
 
2.6%
grade11 539
 
2.5%
grade9 491
 
2.3%
Other values (2) 535
 
2.5%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
grade4 3544
16.7%
grade5 3086
14.6%
grade3 3032
14.3%
grade7 2749
13.0%
grade8 2546
12.0%
grade6 2450
11.6%
grade2 1678
7.9%
grade10 558
 
2.6%
grade11 539
 
2.5%
grade9 491
 
2.3%
Other values (2) 535
 
2.5%

Most occurring characters

ValueCountFrequency (%)
g 21208
16.5%
r 21208
16.5%
a 21208
16.5%
d 21208
16.5%
e 21208
16.5%
4 3544
 
2.8%
5 3086
 
2.4%
3 3032
 
2.4%
7 2749
 
2.1%
8 2546
 
2.0%
Other values (5) 7788
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 106040
82.3%
Decimal Number 22745
 
17.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      3544   15.6%
5      3086   13.6%
3      3032   13.3%
7      2749   12.1%
8      2546   11.2%
6      2450   10.8%
1      2171   9.5%
2      2118   9.3%
0      558    2.5%
9      491    2.2%
Lowercase Letter
ValueCountFrequency (%)
g 21208
20.0%
r 21208
20.0%
a 21208
20.0%
d 21208
20.0%
e 21208
20.0%
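
The digit counts in the Decimal Number table make it possible to recover the two grades hidden behind "Other values (2) 535" in the Common Values table. The worked example below assumes those two hidden labels are `grade1` and `grade12`, which is inferred from the ten listed labels (`grade2` to `grade11`) and is not stated in the report.

```python
n_rows = 21208
# Counts listed in the Common Values table.
listed = {"grade4": 3544, "grade5": 3086, "grade3": 3032, "grade7": 2749,
          "grade8": 2546, "grade6": 2450, "grade2": 1678, "grade10": 558,
          "grade11": 539, "grade9": 491}
digit_counts = {"1": 2171, "2": 2118}  # from the Decimal Number table

other = n_rows - sum(listed.values())           # rows behind "Other values (2)"
grade12 = digit_counts["2"] - listed["grade2"]  # digit 2 occurs in grade2 and grade12
grade1 = other - grade12                        # assumed label of the remaining value

# Cross-check with digit 1: grade1, grade10, grade11 (two 1s each), grade12.
ones = grade1 + listed["grade10"] + 2 * listed["grade11"] + grade12

print(other, grade12, grade1)      # 535 440 95
print(ones == digit_counts["1"])   # True
```

Under that assumption the counts are 95 rows for `grade1` and 440 for `grade12`, and the digit-1 total of 2171 checks out.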

Most occurring scripts

ValueCountFrequency (%)
Latin 106040
82.3%
Common 22745
 
17.7%

Most frequent character per script

Common
ValueCountFrequency (%)
4 3544
15.6%
5 3086
13.6%
3 3032
13.3%
7 2749
12.1%
8 2546
11.2%
6 2450
10.8%
1 2171
9.5%
2 2118
9.3%
0 558
 
2.5%
9 491
 
2.2%
Latin
ValueCountFrequency (%)
g 21208
20.0%
r 21208
20.0%
a 21208
20.0%
d 21208
20.0%
e 21208
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 128785
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 21208
16.5%
r 21208
16.5%
a 21208
16.5%
d 21208
16.5%
e 21208
16.5%
4 3544
 
2.8%
5 3086
 
2.4%
3 3032
 
2.4%
7 2749
 
2.1%
8 2546
 
2.0%
Other values (5) 7788
 
6.0%

subject
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.0 KiB
natural science
11487 
language science
5371 
social science
4350 

Length

Max length: 16
Median length: 15
Mean length: 15.048142
Min length: 14

Characters and Unicode

Total characters: 319141
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: social science
2nd row: natural science
3rd row: natural science
4th row: language science
5th row: natural science

Common Values

ValueCountFrequency (%)
natural science 11487
54.2%
language science 5371
25.3%
social science 4350
 
20.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value     Count  Frequency (%)
science   21208  50.0%
natural   11487  27.1%
language  5371   12.7%
social    4350   10.3%
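
This word-frequency table can be rebuilt from the three `subject` values and their counts. The sketch below assumes words are obtained by lowercasing and splitting on whitespace; that tokenization rule is an assumption that reproduces the figures here, not a documented detail of the profiler.

```python
from collections import Counter

# Value counts of `subject` as listed in this report.
value_counts = {"natural science": 11487, "language science": 5371, "social science": 4350}

word_counts = Counter()
for value, n in value_counts.items():
    for word in value.lower().split():  # assumed tokenization
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(word, n, f"{100 * n / total_words:.1f}%")
```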

Most occurring characters

ValueCountFrequency (%)
e 47787
15.0%
c 46766
14.7%
n 38066
11.9%
a 38066
11.9%
s 25558
8.0%
i 25558
8.0%
l 21208
6.6%
21208
6.6%
u 16858
 
5.3%
t 11487
 
3.6%
Other values (3) 26579
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 297933
93.4%
Space Separator 21208
 
6.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 47787
16.0%
c 46766
15.7%
n 38066
12.8%
a 38066
12.8%
s 25558
8.6%
i 25558
8.6%
l 21208
7.1%
u 16858
 
5.7%
t 11487
 
3.9%
r 11487
 
3.9%
Other values (2) 15092
 
5.1%
Space Separator
ValueCountFrequency (%)
21208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 297933
93.4%
Common 21208
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 47787
16.0%
c 46766
15.7%
n 38066
12.8%
a 38066
12.8%
s 25558
8.6%
i 25558
8.6%
l 21208
7.1%
u 16858
 
5.7%
t 11487
 
3.9%
r 11487
 
3.9%
Other values (2) 15092
 
5.1%
Common
ValueCountFrequency (%)
21208
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 319141
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 47787
15.0%
c 46766
14.7%
n 38066
11.9%
a 38066
11.9%
s 25558
8.0%
i 25558
8.0%
l 21208
6.6%
21208
6.6%
u 16858
 
5.3%
t 11487
 
3.6%
Other values (3) 26579
8.3%

topic
Categorical

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB
biology
4098 
physics
3215 
geography
2956 
writing-strategies
1650 
figurative-language
1260 
Other values (21)
8029 

Length

Max length: 33
Median length: 22
Mean length: 11.760892
Min length: 5

Characters and Unicode

Total characters: 249425
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: geography
2nd row: science-and-engineering-practices
3rd row: science-and-engineering-practices
4th row: figurative-language
5th row: science-and-engineering-practices

Common Values

ValueCountFrequency (%)
biology 4098
19.3%
physics 3215
15.2%
geography 2956
13.9%
writing-strategies 1650
7.8%
figurative-language 1260
 
5.9%
chemistry 1194
 
5.6%
earth-science 1152
 
5.4%
science-and-engineering-practices 924
 
4.4%
units-and-measurement 870
 
4.1%
reference-skills 724
 
3.4%
Other values (16) 3165
14.9%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
biology 4098
19.3%
physics 3215
15.2%
geography 2956
13.9%
writing-strategies 1650
7.8%
figurative-language 1260
 
5.9%
chemistry 1194
 
5.6%
earth-science 1152
 
5.4%
science-and-engineering-practices 924
 
4.4%
units-and-measurement 870
 
4.1%
reference-skills 724
 
3.4%
Other values (16) 3165
14.9%

Most occurring characters

ValueCountFrequency (%)
e 25968
 
10.4%
i 25405
 
10.2%
s 19624
 
7.9%
g 19500
 
7.8%
r 16199
 
6.5%
a 15727
 
6.3%
o 14556
 
5.8%
n 14464
 
5.8%
c 13915
 
5.6%
t 13303
 
5.3%
Other values (14) 70764
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 239262
95.9%
Dash Punctuation 10163
 
4.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 25968
10.9%
i 25405
10.6%
s 19624
 
8.2%
g 19500
 
8.2%
r 16199
 
6.8%
a 15727
 
6.6%
o 14556
 
6.1%
n 14464
 
6.0%
c 13915
 
5.8%
t 13303
 
5.6%
Other values (13) 60601
25.3%
Dash Punctuation
ValueCountFrequency (%)
- 10163
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 239262
95.9%
Common 10163
 
4.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 25968
10.9%
i 25405
10.6%
s 19624
 
8.2%
g 19500
 
8.2%
r 16199
 
6.8%
a 15727
 
6.6%
o 14556
 
6.1%
n 14464
 
6.0%
c 13915
 
5.8%
t 13303
 
5.6%
Other values (13) 60601
25.3%
Common
ValueCountFrequency (%)
- 10163
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 249425
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 25968
 
10.4%
i 25405
 
10.2%
s 19624
 
7.9%
g 19500
 
7.8%
r 16199
 
6.5%
a 15727
 
6.3%
o 14556
 
5.8%
n 14464
 
5.8%
c 13915
 
5.6%
t 13303
 
5.3%
Other values (14) 70764
28.4%

category
Categorical

Distinct: 127
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 46.6 KiB
State capitals
 
1475
Literary devices
 
1260
Genes to traits
 
958
Units and measurement
 
845
Classification
 
835
Other values (122)
15835 

Length

Max length: 35
Median length: 29
Mean length: 17.36241
Min length: 4

Characters and Unicode

Total characters: 368222
Distinct characters: 57
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Geography
2nd row: Designing experiments
3rd row: Designing experiments
4th row: Literary devices
5th row: Engineering practices

Common Values

ValueCountFrequency (%)
State capitals 1475
 
7.0%
Literary devices 1260
 
5.9%
Genes to traits 958
 
4.5%
Units and measurement 845
 
4.0%
Classification 835
 
3.9%
Materials 804
 
3.8%
Reference skills 697
 
3.3%
Sentences, fragments, and run-ons 677
 
3.2%
Designing experiments 646
 
3.0%
Magnets 617
 
2.9%
Other values (117) 12394
58.4%

Length

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
and 6252
 
12.8%
state 1475
 
3.0%
capitals 1475
 
3.0%
traits 1338
 
2.7%
literary 1260
 
2.6%
devices 1260
 
2.6%
classification 1101
 
2.3%
geography 998
 
2.0%
genes 958
 
2.0%
to 958
 
2.0%
Other values (195) 31636
64.9%

Most occurring characters

ValueCountFrequency (%)
e 41254
11.2%
a 33131
 
9.0%
n 30189
 
8.2%
i 28720
 
7.8%
t 28364
 
7.7%
s 27802
 
7.6%
27503
 
7.5%
r 19519
 
5.3%
c 17379
 
4.7%
o 16158
 
4.4%
Other values (47) 98203
26.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 314598
85.4%
Space Separator 27503
 
7.5%
Uppercase Letter 22404
 
6.1%
Other Punctuation 2947
 
0.8%
Dash Punctuation 704
 
0.2%
Decimal Number 54
 
< 0.1%
Control 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 41254
13.1%
a 33131
10.5%
n 30189
9.6%
i 28720
9.1%
t 28364
9.0%
s 27802
8.8%
r 19519
 
6.2%
c 17379
 
5.5%
o 16158
 
5.1%
l 12805
 
4.1%
Other values (15) 59277
18.8%
Uppercase Letter
ValueCountFrequency (%)
S 3211
14.3%
C 2055
 
9.2%
A 1959
 
8.7%
M 1757
 
7.8%
G 1561
 
7.0%
P 1446
 
6.5%
L 1260
 
5.6%
E 1240
 
5.5%
R 1153
 
5.1%
D 1120
 
5.0%
Other values (12) 5642
25.2%
Decimal Number
ValueCountFrequency (%)
1 27
50.0%
9 9
 
16.7%
2 9
 
16.7%
0 9
 
16.7%
Other Punctuation
ValueCountFrequency (%)
, 2372
80.5%
: 487
 
16.5%
' 88
 
3.0%
Space Separator
ValueCountFrequency (%)
27503
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 704
100.0%
Control
ValueCountFrequency (%)
12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 337002
91.5%
Common 31220
 
8.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 41254
12.2%
a 33131
9.8%
n 30189
 
9.0%
i 28720
 
8.5%
t 28364
 
8.4%
s 27802
 
8.2%
r 19519
 
5.8%
c 17379
 
5.2%
o 16158
 
4.8%
l 12805
 
3.8%
Other values (37) 81681
24.2%
Common
ValueCountFrequency (%)
27503
88.1%
, 2372
 
7.6%
- 704
 
2.3%
: 487
 
1.6%
' 88
 
0.3%
1 27
 
0.1%
12
 
< 0.1%
9 9
 
< 0.1%
2 9
 
< 0.1%
0 9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 368222
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 41254
11.2%
a 33131
 
9.0%
n 30189
 
8.2%
i 28720
 
7.8%
t 28364
 
7.7%
s 27802
 
7.6%
27503
 
7.5%
r 19519
 
5.3%
c 17379
 
4.7%
o 16158
 
4.4%
Other values (47) 98203
26.7%

skill
Categorical

Distinct: 379
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 52.6 KiB
Use guide words 697
Inherited and acquired traits: use evidence to support a statement 585
Compare physical and chemical changes 447
Read a map: cardinal directions 400
Compare magnitudes of magnetic forces 399
Other values (374) 18680

Length

Max length: 100
Median length: 63
Mean length: 37.75811
Min length: 4

Characters and Unicode

Total characters: 800774
Distinct characters: 64
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
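
As a rough illustration of how per-category character counts like the ones in this report can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example, not part of pandas-profiling; the two input strings are taken from the sample rows of the `skill` column.

```python
import unicodedata
from collections import Counter

# Human-readable names for a few Unicode general-category codes
# (illustrative subset; the mapping itself is an assumption of this sketch).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters by Unicode general category across an iterable of strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts

sample = [
    "Read a map: cardinal directions",
    "Evaluate tests of engineering-design solutions",
]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Running the same tally over a whole column should give figures comparable to the "Most occurring categories" table, although the report's exact counting rules are not confirmed here.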

Unique

Unique: 128
Unique (%): 0.6%

Sample

1st row: Read a map: cardinal directions
2nd row: Identify the experimental question
3rd row: Identify the experimental question
4th row: Classify the figure of speech: anaphora, antithesis, apostrophe, assonance, chiasmus, understatement
5th row: Evaluate tests of engineering-design solutions

Common Values

Value  Count  Frequency (%)
Use guide words  697  3.3%
Inherited and acquired traits: use evidence to support a statement  585  2.8%
Compare physical and chemical changes  447  2.1%
Read a map: cardinal directions  400  1.9%
Compare magnitudes of magnetic forces  399  1.9%
Identify the Thirteen Colonies  398  1.9%
Compare properties of objects  381  1.8%
Classify logical fallacies  373  1.8%
Identify mammals, birds, fish, reptiles, and amphibians  342  1.6%
Use scientific names to classify organisms  341  1.6%
Other values (369)  16845  79.4%

Length

Histogram of lengths of the category
Value                 Count  Frequency (%)
and                    7516  6.5%
of                     7115  6.1%
identify               6640  5.7%
the                    5027  4.3%
compare                2648  2.3%
use                    2282  2.0%
to                     2089  1.8%
classify               1847  1.6%
a                      1794  1.5%
or                     1692  1.5%
Other values (710)    77499  66.7%

Most occurring characters

ValueCountFrequency (%)
94941
11.9%
e 87596
 
10.9%
t 63588
 
7.9%
s 58342
 
7.3%
a 56849
 
7.1%
n 50782
 
6.3%
i 50390
 
6.3%
o 43693
 
5.5%
r 36521
 
4.6%
d 28751
 
3.6%
Other values (54) 229321
28.6%

Most occurring categories

Value                 Count  Frequency (%)
Lowercase Letter     668594  83.5%
Space Separator       94941  11.9%
Uppercase Letter      24987  3.1%
Other Punctuation     10800  1.3%
Dash Punctuation        753  0.1%
Decimal Number          645  0.1%
Open Punctuation         27  < 0.1%
Close Punctuation        27  < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 87596
13.1%
t 63588
 
9.5%
s 58342
 
8.7%
a 56849
 
8.5%
n 50782
 
7.6%
i 50390
 
7.5%
o 43693
 
6.5%
r 36521
 
5.5%
d 28751
 
4.3%
c 27840
 
4.2%
Other values (17) 164242
24.6%
Uppercase Letter
ValueCountFrequency (%)
I 9033
36.2%
C 5534
22.1%
U 1820
 
7.3%
W 1014
 
4.1%
R 944
 
3.8%
G 836
 
3.3%
T 804
 
3.2%
A 779
 
3.1%
E 712
 
2.8%
D 621
 
2.5%
Other values (12) 2890
 
11.6%
Decimal Number
ValueCountFrequency (%)
0 305
47.3%
5 305
47.3%
7 27
 
4.2%
1 4
 
0.6%
8 2
 
0.3%
2 2
 
0.3%
Other Punctuation
ValueCountFrequency (%)
, 6024
55.8%
: 2551
23.6%
? 1885
 
17.5%
' 306
 
2.8%
. 34
 
0.3%
Space Separator
ValueCountFrequency (%)
94941
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 753
100.0%
Open Punctuation
ValueCountFrequency (%)
( 27
100.0%
Close Punctuation
ValueCountFrequency (%)
) 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 693581
86.6%
Common 107193
 
13.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 87596
12.6%
t 63588
 
9.2%
s 58342
 
8.4%
a 56849
 
8.2%
n 50782
 
7.3%
i 50390
 
7.3%
o 43693
 
6.3%
r 36521
 
5.3%
d 28751
 
4.1%
c 27840
 
4.0%
Other values (39) 189229
27.3%
Common
ValueCountFrequency (%)
94941
88.6%
, 6024
 
5.6%
: 2551
 
2.4%
? 1885
 
1.8%
- 753
 
0.7%
' 306
 
0.3%
0 305
 
0.3%
5 305
 
0.3%
. 34
 
< 0.1%
( 27
 
< 0.1%
Other values (5) 62
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800773
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
94941
11.9%
e 87596
 
10.9%
t 63588
 
7.9%
s 58342
 
7.3%
a 56849
 
7.1%
n 50782
 
6.3%
i 50390
 
6.3%
o 43693
 
5.5%
r 36521
 
4.6%
d 28751
 
3.6%
Other values (53) 229320
28.6%
None
ValueCountFrequency (%)
í 1
100.0%

lecture
Categorical

Distinct: 262
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
(empty) 3410
Guide words appear on each page of a dictionary. They tell you the first word and last word on the page. The other words on the page come between the guide words in alphabetical order. To put words in alphabetical order, put them in order by their first letters. If the first letters are the same, look at the second letters. If the second letters are the same, look at the third letters, and so on. If one word is shorter, and there are no more letters to compare, then the shorter word comes first in alphabetical order. For example, be comes before bed.
 
598
Maps have four cardinal directions, or main directions. Those directions are north, south, east, and west. A compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction. The north arrow points to the North Pole. On most maps, north is at the top of the map.
 
397
Scientists use scientific names to identify organisms. Scientific names are made of two words. The first word in an organism's scientific name tells you the organism's genus. A genus is a group of organisms that share many traits. A genus is made up of one or more species. A species is a group of very similar organisms. The second word in an organism's scientific name tells you its species within its genus. Together, the two parts of an organism's scientific name identify its species. For example Ursus maritimus and Ursus americanus are two species of bears. They are part of the same genus, Ursus. But they are different species within the genus. Ursus maritimus has the species name maritimus. Ursus americanus has the species name americanus. Both bears have small round ears and sharp claws. But Ursus maritimus has white fur and Ursus americanus has black fur.
 
341
Organisms, including people, have both inherited and acquired traits. Inherited and acquired traits are gained in different ways. Inherited traits are passed down from biological parents to their offspring through genes. Genes are pieces of hereditary material that contain the instructions that affect inherited traits. Offspring receive their genes, and therefore gain their inherited traits, from their biological parents. Inherited traits do not need to be learned. Acquired traits are gained during a person's life. Some acquired traits, such as riding a bicycle, are gained by learning. Other acquired traits, such as scars, are caused by the environment. Parents do not pass acquired traits down to their offspring.
 
297
Other values (257)
16165 

Length

Max length: 3076
Median length: 1283
Mean length: 604.0637
Min length: 0

Characters and Unicode

Total characters: 12810983
Distinct characters: 83
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st rowMaps have four cardinal directions, or main directions. Those directions are north, south, east, and west. A compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction. The north arrow points to the North Pole. On most maps, north is at the top of the map.
2nd rowExperiments can be designed to answer specific questions. How can you identify the questions that a certain experiment can answer? In order to do this, you need to figure out what was tested and what was measured during the experiment. Imagine an experiment with two groups of daffodil plants. One group of plants was grown in sandy soil, and the other was grown in clay soil. Then, the height of each plant was measured. First, identify the part of the experiment that was tested. The part of an experiment that is tested usually involves the part of the experimental setup that is different or changed. In the experiment described above, each group of plants was grown in a different type of soil. So, the effect of growing plants in different soil types was tested. Then, identify the part of the experiment that was measured. The part of the experiment that is measured may include measurements and calculations. In the experiment described above, the heights of the plants in each group were measured. Experiments can answer questions about how the part of the experiment that is tested affects the part that is measured. So, the experiment described above can answer questions about how soil type affects plant height. Examples of questions that this experiment can answer include: Does soil type affect the height of daffodil plants? Do daffodil plants in sandy soil grow taller than daffodil plants in clay soil? Are daffodil plants grown in sandy soil shorter than daffodil plants grown in clay soil?
3rd rowExperiments can be designed to answer specific questions. How can you identify the questions that a certain experiment can answer? In order to do this, you need to figure out what was tested and what was measured during the experiment. Imagine an experiment with two groups of daffodil plants. One group of plants was grown in sandy soil, and the other was grown in clay soil. Then, the height of each plant was measured. First, identify the part of the experiment that was tested. The part of an experiment that is tested usually involves the part of the experimental setup that is different or changed. In the experiment described above, each group of plants was grown in a different type of soil. So, the effect of growing plants in different soil types was tested. Then, identify the part of the experiment that was measured. The part of the experiment that is measured may include measurements and calculations. In the experiment described above, the heights of the plants in each group were measured. Experiments can answer questions about how the part of the experiment that is tested affects the part that is measured. So, the experiment described above can answer questions about how soil type affects plant height. Examples of questions that this experiment can answer include: Does soil type affect the height of daffodil plants? Do daffodil plants in sandy soil grow taller than daffodil plants in clay soil? Are daffodil plants grown in sandy soil shorter than daffodil plants grown in clay soil?
4th rowFigures of speech are words or phrases that use language in a nonliteral or unusual way. They can make writing more expressive. Anaphora is the repetition of the same word or words at the beginning of several phrases or clauses. We are united. We are powerful. We are winners. Antithesis involves contrasting opposing ideas within a parallel grammatical structure. I want to help, not to hurt. Apostrophe is a direct address to an absent person or a nonhuman entity. Oh, little bird, what makes you sing so beautifully? Assonance is the repetition of a vowel sound in a series of nearby words. Try to light the fire. Chiasmus is an expression in which the second half parallels the first but reverses the order of words. Never let a fool kiss you or a kiss fool you. Understatement involves deliberately representing something as less serious or important than it really is. As you know, it can get a little cold in the Antarctic.
5th rowPeople can use the engineering-design process to develop solutions to problems. One step in the process is testing if a potential solution meets the requirements of the design. How can you determine what a test can show? You need to figure out what was tested and what was measured. Imagine an engineer needs to design a bridge for a windy location. She wants to make sure the bridge will not move too much in high wind. So, she builds a smaller prototype, or model, of a bridge. Then, she exposes the prototype to high winds and measures how much the bridge moves. First, identify what was tested. A test can examine one design, or it may compare multiple prototypes to each other. In the test described above, the engineer tested a prototype of a bridge in high wind. Then, identify what the test measured. One of the criteria for the bridge was that it not move too much in high winds. The test measured how much the prototype bridge moved. Tests can show how well one or more designs meet the criteria. The test described above can show whether the bridge would move too much in high winds.

Common Values

ValueCountFrequency (%)
(empty) 3410
 
16.1%
Guide words appear on each page of a dictionary. They tell you the first word and last word on the page. The other words on the page come between the guide words in alphabetical order. To put words in alphabetical order, put them in order by their first letters. If the first letters are the same, look at the second letters. If the second letters are the same, look at the third letters, and so on. If one word is shorter, and there are no more letters to compare, then the shorter word comes first in alphabetical order. For example, be comes before bed. 598
 
2.8%
Maps have four cardinal directions, or main directions. Those directions are north, south, east, and west. A compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction. The north arrow points to the North Pole. On most maps, north is at the top of the map. 397
 
1.9%
Scientists use scientific names to identify organisms. Scientific names are made of two words. The first word in an organism's scientific name tells you the organism's genus. A genus is a group of organisms that share many traits. A genus is made up of one or more species. A species is a group of very similar organisms. The second word in an organism's scientific name tells you its species within its genus. Together, the two parts of an organism's scientific name identify its species. For example Ursus maritimus and Ursus americanus are two species of bears. They are part of the same genus, Ursus. But they are different species within the genus. Ursus maritimus has the species name maritimus. Ursus americanus has the species name americanus. Both bears have small round ears and sharp claws. But Ursus maritimus has white fur and Ursus americanus has black fur. 341
 
1.6%
Organisms, including people, have both inherited and acquired traits. Inherited and acquired traits are gained in different ways. Inherited traits are passed down from biological parents to their offspring through genes. Genes are pieces of hereditary material that contain the instructions that affect inherited traits. Offspring receive their genes, and therefore gain their inherited traits, from their biological parents. Inherited traits do not need to be learned. Acquired traits are gained during a person's life. Some acquired traits, such as riding a bicycle, are gained by learning. Other acquired traits, such as scars, are caused by the environment. Parents do not pass acquired traits down to their offspring. 297
 
1.4%
A solution is made up of two or more substances that are completely mixed. In a solution, solute particles are mixed into a solvent. The solute cannot be separated from the solvent by a filter. For example, if you stir a spoonful of salt into a cup of water, the salt will mix into the water to make a saltwater solution. In this case, the salt is the solute. The water is the solvent. The concentration of a solute in a solution is a measure of the ratio of solute to solvent. Concentration can be described in terms of particles of solute per volume of solvent. concentration = particles of solute / volume of solvent 295
 
1.4%
The atmosphere is the layer of air that surrounds Earth. Both weather and climate tell you about the atmosphere. Weather is what the atmosphere is like at a certain place and time. Weather can change quickly. For example, the temperature outside your house might get higher throughout the day. Climate is the pattern of weather in a certain place. For example, summer temperatures in New York are usually higher than winter temperatures. 294
 
1.4%
Experiments can be designed to answer specific questions. When designing an experiment, you must identify the supplies that are necessary to answer your question. In order to do this, you need to figure out what will be tested and what will be measured during the experiment. Imagine that you are wondering if plants grow to different heights when planted in different types of soil. How might you decide what supplies are necessary to conduct this experiment? First, you need to identify the part of the experiment that will be tested, which is the independent variable. This is usually the part of the experiment that is different or changed. In this case, you would like to know how plants grow in different types of soil. So, you must have different types of soil available. Next, you need to identify the part of the experiment that will be measured or observed, which is the dependent variable. In this experiment, you would like to know if some plants grow taller than others. So, you must be able to compare the plants' heights. To do this, you can observe which plants are taller by looking at them, or you can measure their exact heights with a meterstick. So, if you have different types of soil and can observe or measure the heights of your plants, then you have the supplies you need to investigate your question with an experiment! 289
 
1.4%
Organisms, including people, have both inherited and acquired traits. Inherited and acquired traits are gained in different ways. Inherited traits are passed down through families. Children gain these traits from their parents. Inherited traits do not need to be learned. Acquired traits are gained during a person's life. Some acquired traits, such as riding a bicycle, are gained by learning. Other acquired traits, such as scars, are caused by the environment. 288
 
1.4%
A letter starts with a greeting and ends with a closing. For each one, capitalize the first word and end with a comma. You should also capitalize proper nouns, such as Aunt Sue. Dear Aunt Sue, I'm glad you could come to my party, and thank you for the birthday gift. I could not have asked for a better one! Every time I see it, I think of you. With love, Rory 288
 
1.4%
Other values (252) 14711
69.4%

Length

Histogram of lengths of the category
Value                   Count  Frequency (%)
the                    139563  6.3%
a                       90363  4.1%
of                      68632  3.1%
is                      53845  2.4%
in                      42474  1.9%
are                     38089  1.7%
to                      36541  1.6%
and                     36291  1.6%
or                      33470  1.5%
that                    29168  1.3%
Other values (2908)   1657419  74.5%

Most occurring characters

ValueCountFrequency (%)
2124123
16.6%
e 1346526
 
10.5%
t 907748
 
7.1%
a 857476
 
6.7%
o 738548
 
5.8%
s 734933
 
5.7%
n 716000
 
5.6%
i 660258
 
5.2%
r 647459
 
5.1%
h 474907
 
3.7%
Other values (73) 3603005
28.1%

Most occurring categories

Value                  Count  Frequency (%)
Lowercase Letter    10009681  78.1%
Space Separator      2124123  16.6%
Other Punctuation     312519  2.4%
Uppercase Letter      228753  1.8%
Control                85966  0.7%
Decimal Number         25262  0.2%
Math Symbol            14144  0.1%
Dash Punctuation        6066  < 0.1%
Other Symbol            1481  < 0.1%
Close Punctuation       1012  < 0.1%
Other values (4)        1976  < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1346526
13.5%
t 907748
 
9.1%
a 857476
 
8.6%
o 738548
 
7.4%
s 734933
 
7.3%
n 716000
 
7.2%
i 660258
 
6.6%
r 647459
 
6.5%
h 474907
 
4.7%
l 413820
 
4.1%
Other values (16) 2512006
25.1%
Uppercase Letter
ValueCountFrequency (%)
T 46578
20.4%
A 37105
16.2%
I 25308
11.1%
F 18958
8.3%
S 16398
 
7.2%
W 10545
 
4.6%
M 7849
 
3.4%
O 7395
 
3.2%
E 7034
 
3.1%
C 6296
 
2.8%
Other values (14) 45287
19.8%
Other Punctuation
ValueCountFrequency (%)
. 176513
56.5%
, 101151
32.4%
' 17058
 
5.5%
: 6325
 
2.0%
? 4057
 
1.3%
" 3768
 
1.2%
! 2528
 
0.8%
/ 907
 
0.3%
; 118
 
< 0.1%
· 94
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 10080
39.9%
0 6556
26.0%
2 2506
 
9.9%
3 1375
 
5.4%
6 1053
 
4.2%
5 990
 
3.9%
8 939
 
3.7%
7 868
 
3.4%
4 677
 
2.7%
9 218
 
0.9%
Math Symbol
ValueCountFrequency (%)
| 13370
94.5%
= 727
 
5.1%
+ 47
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 5309
87.5%
— 757
 
12.5%
Space Separator
ValueCountFrequency (%)
2124123
100.0%
Control
ValueCountFrequency (%)
85966
100.0%
Other Symbol
ValueCountFrequency (%)
° 1481
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1012
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1012
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 746
100.0%
Format
ValueCountFrequency (%)
­ 193
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 25
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10238434
79.9%
Common 2572549
 
20.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1346526
13.2%
t 907748
 
8.9%
a 857476
 
8.4%
o 738548
 
7.2%
s 734933
 
7.2%
n 716000
 
7.0%
i 660258
 
6.4%
r 647459
 
6.3%
h 474907
 
4.6%
l 413820
 
4.0%
Other values (40) 2740759
26.8%
Common
ValueCountFrequency (%)
2124123
82.6%
. 176513
 
6.9%
, 101151
 
3.9%
85966
 
3.3%
' 17058
 
0.7%
| 13370
 
0.5%
1 10080
 
0.4%
0 6556
 
0.3%
: 6325
 
0.2%
- 5309
 
0.2%
Other values (23) 26098
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12808433
> 99.9%
None 1768
 
< 0.1%
Punctuation 782
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2124123
16.6%
e 1346526
 
10.5%
t 907748
 
7.1%
a 857476
 
6.7%
o 738548
 
5.8%
s 734933
 
5.7%
n 716000
 
5.6%
i 660258
 
5.2%
r 647459
 
5.1%
h 474907
 
3.7%
Other values (68) 3600455
28.1%
None
ValueCountFrequency (%)
° 1481
83.8%
­ 193
 
10.9%
· 94
 
5.3%
Punctuation
ValueCountFrequency (%)
— 757
96.8%
’ 25
 
3.2%

solution
Categorical

Distinct: 12952
Distinct (%): 61.1%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
(empty) 2006
The second closing is correct: Its first word is capitalized, and it ends with a comma.
 
82
The first closing is correct: Its first word is capitalized, and it ends with a comma.
 
75
Children do not inherit their parent's scars. Instead, scars are caused by the environment. People can get scars after they get hurt. So, having a scar is an acquired trait.
 
74
The particles in both samples have the same average speed, but each particle in sample B has more mass than each particle in sample A. So, the particles in sample B have a higher average kinetic energy than the particles in sample A. Because the particles in sample B have the higher average kinetic energy, sample B must have the higher temperature.
 
62
Other values (12947)
18909 

Length

Max length: 2716
Median length: 1154
Mean length: 244.15098
Min length: 0

Characters and Unicode

Total characters: 5177954
Distinct characters: 101
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12157
Unique (%): 57.3%

Sample

1st row: To find the answer, look at the compass rose. Look at which way the north arrow is pointing. West Virginia is farthest north.
2nd row: (empty)
3rd row: (empty)
4th row: The text uses apostrophe, a direct address to an absent person or a nonhuman entity. O goddess is a direct address to a goddess, a nonhuman entity.
5th row: (empty)

Common Values

ValueCountFrequency (%)
(empty) 2006
 
9.5%
The second closing is correct: Its first word is capitalized, and it ends with a comma. 82
 
0.4%
The first closing is correct: Its first word is capitalized, and it ends with a comma. 75
 
0.4%
Children do not inherit their parent's scars. Instead, scars are caused by the environment. People can get scars after they get hurt. So, having a scar is an acquired trait. 74
 
0.3%
The particles in both samples have the same average speed, but each particle in sample B has more mass than each particle in sample A. So, the particles in sample B have a higher average kinetic energy than the particles in sample A. Because the particles in sample B have the higher average kinetic energy, sample B must have the higher temperature. 62
 
0.3%
Distance affects the strength of the magnetic force. But the distance between the magnets in Pair 1 and in Pair 2 is the same. So, the strength of the magnetic force is the same in both pairs. 60
 
0.3%
Olympia is the capital of Washington. 57
 
0.3%
To predict if these magnets will attract or repel, look at which poles are closest to each other. The north pole of one magnet is closest to the north pole of the other magnet. Like poles repel. So, these magnets will repel each other. 55
 
0.3%
The particles in both samples have the same average speed, but each particle in sample A has more mass than each particle in sample B. So, the particles in sample A have a higher average kinetic energy than the particles in sample B. Because the particles in sample A have the higher average kinetic energy, sample A must have the higher temperature. 53
 
0.2%
Cheyenne is the capital of Wyoming. 52
 
0.2%
Other values (12942) 18632
87.9%

Length

Histogram of lengths of the category
Value                   Count  Frequency (%)
the                     76403  8.3%
is                      35975  3.9%
a                       34524  3.8%
of                      25335  2.8%
in                      17293  1.9%
to                      14117  1.5%
and                     13684  1.5%
it                      12375  1.4%
are                     10446  1.1%
that                     9101  1.0%
Other values (14015)   665909  72.8%

Most occurring characters

ValueCountFrequency (%)
864891
16.7%
e 522339
 
10.1%
t 376878
 
7.3%
a 350915
 
6.8%
o 306545
 
5.9%
s 288929
 
5.6%
i 281655
 
5.4%
n 263806
 
5.1%
r 245406
 
4.7%
h 221921
 
4.3%
Other values (91) 1454669
28.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3991939
77.1%
Space Separator 864891
 
16.7%
Other Punctuation 134199
 
2.6%
Uppercase Letter 130571
 
2.5%
Control 31142
 
0.6%
Decimal Number 17352
 
0.3%
Dash Punctuation 4288
 
0.1%
Open Punctuation 918
 
< 0.1%
Close Punctuation 918
 
< 0.1%
Other Symbol 777
 
< 0.1%
Other values (4) 959
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 522339
13.1%
t 376878
 
9.4%
a 350915
 
8.8%
o 306545
 
7.7%
s 288929
 
7.2%
i 281655
 
7.1%
n 263806
 
6.6%
r 245406
 
6.1%
h 221921
 
5.6%
l 168057
 
4.2%
Other values (27) 965488
24.2%
Uppercase Letter
ValueCountFrequency (%)
T 30966
23.7%
S 14486
11.1%
A 14265
10.9%
I 11068
 
8.5%
B 8901
 
6.8%
P 6230
 
4.8%
C 5866
 
4.5%
L 5548
 
4.2%
M 4657
 
3.6%
N 4353
 
3.3%
Other values (16) 24231
18.6%
Other Punctuation
ValueCountFrequency (%)
. 89313
66.6%
, 30902
 
23.0%
' 6205
 
4.6%
: 3632
 
2.7%
" 2595
 
1.9%
! 501
 
0.4%
? 497
 
0.4%
* 336
 
0.3%
; 167
 
0.1%
% 38
 
< 0.1%
Other values (2) 13
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 4329
24.9%
2 3488
20.1%
0 2933
16.9%
5 1607
 
9.3%
3 1287
 
7.4%
4 1015
 
5.8%
6 743
 
4.3%
9 673
 
3.9%
8 669
 
3.9%
7 608
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 4210
98.2%
— 67
 
1.6%
– 11
 
0.3%
Math Symbol
ValueCountFrequency (%)
> 662
98.2%
+ 10
 
1.5%
− 2
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 917
99.9%
[ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 917
99.9%
] 1
 
0.1%
Space Separator
ValueCountFrequency (%)
864891
100.0%
Control
ValueCountFrequency (%)
31142
100.0%
Other Symbol
ValueCountFrequency (%)
° 777
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 280
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 4
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4122510
79.6%
Common 1055444
 
20.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 522339
12.7%
t 376878
 
9.1%
a 350915
 
8.5%
o 306545
 
7.4%
s 288929
 
7.0%
i 281655
 
6.8%
n 263806
 
6.4%
r 245406
 
6.0%
h 221921
 
5.4%
l 168057
 
4.1%
Other values (53) 1096059
26.6%
Common
ValueCountFrequency (%)
864891
81.9%
. 89313
 
8.5%
31142
 
3.0%
, 30902
 
2.9%
' 6205
 
0.6%
1 4329
 
0.4%
- 4210
 
0.4%
: 3632
 
0.3%
2 3488
 
0.3%
0 2933
 
0.3%
Other values (28) 14399
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5177040
> 99.9%
None 830
 
< 0.1%
Punctuation 82
 
< 0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
864891
16.7%
e 522339
 
10.1%
t 376878
 
7.3%
a 350915
 
6.8%
o 306545
 
5.9%
s 288929
 
5.6%
i 281655
 
5.4%
n 263806
 
5.1%
r 245406
 
4.7%
h 221921
 
4.3%
Other values (75) 1453755
28.1%
None
ValueCountFrequency (%)
° 777
93.6%
é 16
 
1.9%
ż 9
 
1.1%
ł 9
 
1.1%
á 5
 
0.6%
ñ 5
 
0.6%
í 3
 
0.4%
ó 2
 
0.2%
è 1
 
0.1%
ø 1
 
0.1%
Other values (2) 2
 
0.2%
Punctuation
ValueCountFrequency (%)
— 67
81.7%
– 11
 
13.4%
’ 4
 
4.9%
Math Operators
ValueCountFrequency (%)
− 2
100.0%

split
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 165.8 KiB
train 12726
test 4241
val 4241

Length

Max length: 5
Median length: 5
Mean length: 4.4000849
Min length: 3

Characters and Unicode

Total characters: 93317
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: train
2nd row: train
3rd row: train
4th row: test
5th row: test

Common Values

Value   Count  Frequency (%)
train   12726  60.0%
test     4241  20.0%
val      4241  20.0%
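
The split figures above are internally consistent, which a short calculation can confirm. The sketch below is only an arithmetic cross-check on the counts reported for `split`; the variable names are illustrative and nothing here reads the actual dataset.

```python
# Counts per split value, copied from the Common Values table.
split_counts = {"train": 12726, "test": 4241, "val": 4241}

total = sum(split_counts.values())

# Share of each split, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in split_counts.items()}

# Total characters and mean string length follow from the label lengths
# ("train" = 5, "test" = 4, "val" = 3) weighted by their counts.
total_chars = sum(len(name) * n for name, n in split_counts.items())
mean_length = total_chars / total

print(total, shares, total_chars, round(mean_length, 7))
```

The total matches the 21208 observations in the overview, and the character total and mean length agree with the Length and Characters statistics reported for this variable.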

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
train   12726  60.0%
test     4241  20.0%
val      4241  20.0%

Most occurring characters

Value   Count  Frequency (%)
t       21208  22.7%
a       16967  18.2%
r       12726  13.6%
i       12726  13.6%
n       12726  13.6%
e        4241  4.5%
s        4241  4.5%
v        4241  4.5%
l        4241  4.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 93317
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 21208
22.7%
a 16967
18.2%
r 12726
13.6%
i 12726
13.6%
n 12726
13.6%
e 4241
 
4.5%
s 4241
 
4.5%
v 4241
 
4.5%
l 4241
 
4.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 93317
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 21208
22.7%
a 16967
18.2%
r 12726
13.6%
i 12726
13.6%
n 12726
13.6%
e 4241
 
4.5%
s 4241
 
4.5%
v 4241
 
4.5%
l 4241
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 93317
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 21208
22.7%
a 16967
18.2%
r 12726
13.6%
i 12726
13.6%
n 12726
13.6%
e 4241
 
4.5%
s 4241
 
4.5%
v 4241
 
4.5%
l 4241
 
4.5%

pid
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 21208
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10604.5
Minimum: 1
Maximum: 21208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 165.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1061.35
Q1: 5302.75
Median: 10604.5
Q3: 15906.25
95-th percentile: 20147.65
Maximum: 21208
Range: 21207
Interquartile range (IQR): 10603.5

Descriptive statistics

Standard deviation: 6122.3666
Coefficient of variation (CV): 0.57733666
Kurtosis: -1.2
Mean: 10604.5
Median Absolute Deviation (MAD): 5302
Skewness: 0
Sum: 2.2490024 × 10^8
Variance: 37483373
Monotonicity: Strictly increasing
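
Because `pid` is unique, strictly increasing, and runs from 1 to 21208, its statistics can be reproduced from closed-form expressions for the integers 1..n. The sketch below is a cross-check under two assumptions that the report does not state explicitly: that the variance is the sample variance (divisor n - 1) and that percentiles use linear interpolation between order statistics.

```python
import statistics

n = 21208                      # number of observations; pid takes each value 1..n once
pids = range(1, n + 1)

total = n * (n + 1) // 2       # sum of 1..n
mean = total / n               # equals (n + 1) / 2
variance = n * (n + 1) / 12    # sample variance (divisor n - 1) of 1..n
std = variance ** 0.5
cv = std / mean                # coefficient of variation

def quantile(q):
    """Linear-interpolation quantile of the evenly spaced values 1..n."""
    return 1 + q * (n - 1)

# Independent check of the closed-form variance against the standard library.
assert abs(statistics.variance(pids) - variance) < 1e-6

print(total, mean, round(variance), round(std, 4), round(cv, 8))
print(quantile(0.05), quantile(0.25), quantile(0.75), quantile(0.95))
```

Under those assumptions the computed values line up with the reported sum, mean, variance, standard deviation, and percentiles, which suggests `pid` is a plain row identifier rather than a measured quantity.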
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
1                         1  < 0.1%
14146                     1  < 0.1%
14144                     1  < 0.1%
14143                     1  < 0.1%
14142                     1  < 0.1%
14141                     1  < 0.1%
14140                     1  < 0.1%
14139                     1  < 0.1%
14138                     1  < 0.1%
14137                     1  < 0.1%
Other values (21198)  21198  > 99.9%
Minimum 10 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%
Maximum 10 values

Value  Count  Frequency (%)
21208  1      < 0.1%
21207  1      < 0.1%
21206  1      < 0.1%
21205  1      < 0.1%
21204  1      < 0.1%
21203  1      < 0.1%
21202  1      < 0.1%
21201  1      < 0.1%
21200  1      < 0.1%
21199  1      < 0.1%

Interactions


Correlations

         pid    answer  task   grade  subject  topic  split
pid      1.000  1.000   1.000  1.000  1.000    1.000  1.000
answer   1.000  1.000   0.070  0.086  0.263    0.219  0.000
task     1.000  0.070   1.000  0.150  0.145    0.438  0.004
grade    1.000  0.086   0.150  1.000  0.419    0.331  0.015
subject  1.000  0.263   0.145  0.419  1.000    0.999  0.007
topic    1.000  0.219   0.438  0.331  0.999    1.000  0.017
split    1.000  0.000   0.004  0.015  0.007    0.017  1.000
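The report text here does not state which association measure fills this matrix. As an illustration only, the sketch below computes plain (uncorrected) Cramér's V, a common measure for pairs of categorical variables. That the report uses this measure, or a bias-corrected variant of it, is an assumption. The function `cramers_v` and the toy columns are hypothetical and written for this example.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: every topic belongs to exactly one subject, so topic determines subject.
topic = ["geography", "geography", "biology", "biology", "physics", "physics"]
subject = ["social science", "social science", "natural science",
           "natural science", "natural science", "natural science"]
v_dependent = cramers_v(topic, subject)

# Toy data: answer and split are balanced across each other, so they are independent.
answer = [0, 1, 0, 1]
split = ["train", "train", "test", "test"]
v_independent = cramers_v(answer, split)

print(round(v_dependent, 3), round(v_independent, 3))
```

Under this reading, a value near 1 (subject versus topic, 0.999) means one variable almost fully determines the other, while a value near 0 (answer versus split, 0.000) means the two are close to independent.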

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
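The nullity figures can be cross-checked with simple arithmetic. The sketch below counts missing values per column on a few toy rows and then recomputes the report-level percentages from the Overview numbers (10876 missing cells, 21208 observations, 15 variables). The toy rows use the image values shown for pid 1 to 6 in the sample, with `None` standing for a missing entry, and the helper `missing_per_column` is hypothetical.

```python
def missing_per_column(rows):
    """Count None entries per column in a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts

# Toy rows: pid and image only, following the first six sample records.
rows = [
    {"pid": 1, "image": "image.png"},
    {"pid": 2, "image": "image.png"},
    {"pid": 3, "image": "image.png"},
    {"pid": 4, "image": None},
    {"pid": 5, "image": "image.png"},
    {"pid": 6, "image": None},
]
toy_missing = missing_per_column(rows)

# Report-level check using the Overview figures.
n_rows, n_columns, missing_cells = 21208, 15, 10876
image_missing_pct = round(100 * missing_cells / n_rows, 1)
overall_missing_pct = round(100 * missing_cells / (n_rows * n_columns), 1)

print(toy_missing, image_missing_pct, overall_missing_pct)
```

Because image is the only column with missing values, the overall missing-cell share is the image share divided across all 15 columns.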

Sample

question  choices  answer  hint  image  task  grade  subject  topic  category  skill  lecture  solution  split  pid
0Which of these states is farthest north?[West Virginia, Louisiana, Arizona, Oklahoma]0image.pngclosed choicegrade2social sciencegeographyGeographyRead a map: cardinal directionsMaps have four cardinal directions, or main directions. Those directions are north, south, east, and west.\nA compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction.\nThe north arrow points to the North Pole. On most maps, north is at the top of the map.To find the answer, look at the compass rose. Look at which way the north arrow is pointing. West Virginia is farthest north.train1
1Identify the question that Tom and Justin's experiment can best answer.[Do ping pong balls stop rolling along the ground sooner after being launched from a 30° angle or a 45° angle?, Do ping pong balls travel farther when launched from a 30° angle compared to a 45° angle?]1The passage below describes an experiment. Read the passage and then follow the instructions below.\n\nTom placed a ping pong ball in a catapult, pulled the catapult's arm back to a 45° angle, and launched the ball. Then, Tom launched another ping pong ball, this time pulling the catapult's arm back to a 30° angle. With each launch, his friend Justin measured the distance between the catapult and the place where the ball hit the ground. Tom and Justin repeated the launches with ping pong balls in four more identical catapults. They compared the distances the balls traveled when launched from a 45° angle to the distances the balls traveled when launched from a 30° angle.\nFigure: a catapult for launching ping pong balls.image.pngclosed choicegrade8natural sciencescience-and-engineering-practicesDesigning experimentsIdentify the experimental questionExperiments can be designed to answer specific questions. How can you identify the questions that a certain experiment can answer? In order to do this, you need to figure out what was tested and what was measured during the experiment.\nImagine an experiment with two groups of daffodil plants. One group of plants was grown in sandy soil, and the other was grown in clay soil. Then, the height of each plant was measured.\nFirst, identify the part of the experiment that was tested. The part of an experiment that is tested usually involves the part of the experimental setup that is different or changed. In the experiment described above, each group of plants was grown in a different type of soil. So, the effect of growing plants in different soil types was tested.\nThen, identify the part of the experiment that was measured. 
The part of the experiment that is measured may include measurements and calculations. In the experiment described above, the heights of the plants in each group were measured.\nExperiments can answer questions about how the part of the experiment that is tested affects the part that is measured. So, the experiment described above can answer questions about how soil type affects plant height.\nExamples of questions that this experiment can answer include:\nDoes soil type affect the height of daffodil plants?\nDo daffodil plants in sandy soil grow taller than daffodil plants in clay soil?\nAre daffodil plants grown in sandy soil shorter than daffodil plants grown in clay soil?train2
2Identify the question that Kathleen and Bryant's experiment can best answer.[Does Kathleen's snowboard slide down a hill in less time when it has a layer of wax or when it does not have a layer of wax?, Does Kathleen's snowboard slide down a hill in less time when it has a thin layer of wax or a thick layer of wax?]0The passage below describes an experiment. Read the passage and then follow the instructions below.\n\nKathleen applied a thin layer of wax to the underside of her snowboard and rode the board straight down a hill. Then, she removed the wax and rode the snowboard straight down the hill again. She repeated the rides four more times, alternating whether she rode with a thin layer of wax on the board or not. Her friend Bryant timed each ride. Kathleen and Bryant calculated the average time it took to slide straight down the hill on the snowboard with wax compared to the average time on the snowboard without wax.\nFigure: snowboarding down a hill.image.pngclosed choicegrade7natural sciencescience-and-engineering-practicesDesigning experimentsIdentify the experimental questionExperiments can be designed to answer specific questions. How can you identify the questions that a certain experiment can answer? In order to do this, you need to figure out what was tested and what was measured during the experiment.\nImagine an experiment with two groups of daffodil plants. One group of plants was grown in sandy soil, and the other was grown in clay soil. Then, the height of each plant was measured.\nFirst, identify the part of the experiment that was tested. The part of an experiment that is tested usually involves the part of the experimental setup that is different or changed. In the experiment described above, each group of plants was grown in a different type of soil. So, the effect of growing plants in different soil types was tested.\nThen, identify the part of the experiment that was measured. 
The part of the experiment that is measured may include measurements and calculations. In the experiment described above, the heights of the plants in each group were measured.\nExperiments can answer questions about how the part of the experiment that is tested affects the part that is measured. So, the experiment described above can answer questions about how soil type affects plant height.\nExamples of questions that this experiment can answer include:\nDoes soil type affect the height of daffodil plants?\nDo daffodil plants in sandy soil grow taller than daffodil plants in clay soil?\nAre daffodil plants grown in sandy soil shorter than daffodil plants grown in clay soil?train3
3Which figure of speech is used in this text?\nSing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans.\n—Homer, The Iliad[chiasmus, apostrophe]1Noneclosed choicegrade11language sciencefigurative-languageLiterary devicesClassify the figure of speech: anaphora, antithesis, apostrophe, assonance, chiasmus, understatementFigures of speech are words or phrases that use language in a nonliteral or unusual way. They can make writing more expressive.\nAnaphora is the repetition of the same word or words at the beginning of several phrases or clauses.\nWe are united. We are powerful. We are winners.\nAntithesis involves contrasting opposing ideas within a parallel grammatical structure.\nI want to help, not to hurt.\nApostrophe is a direct address to an absent person or a nonhuman entity.\nOh, little bird, what makes you sing so beautifully?\nAssonance is the repetition of a vowel sound in a series of nearby words.\nTry to light the fire.\nChiasmus is an expression in which the second half parallels the first but reverses the order of words.\nNever let a fool kiss you or a kiss fool you.\nUnderstatement involves deliberately representing something as less serious or important than it really is.\nAs you know, it can get a little cold in the Antarctic.The text uses apostrophe, a direct address to an absent person or a nonhuman entity.\nO goddess is a direct address to a goddess, a nonhuman entity.test4
4Which of the following could Gordon's test show?[if the spacecraft was damaged when using a parachute with a 1 m vent going 200 km per hour, how steady a parachute with a 1 m vent was at 200 km per hour, whether a parachute with a 1 m vent would swing too much at 400 km per hour]1People can use the engineering-design process to develop solutions to problems. One step in the process is testing if a potential solution meets the requirements of the design.\nThe passage below describes how the engineering-design process was used to test a solution to a problem. Read the passage. Then answer the question below.\n\nGordon was an aerospace engineer who was developing a parachute for a spacecraft that would land on Mars. He needed to add a vent at the center of the parachute so the spacecraft would land smoothly. However, the spacecraft would have to travel at a high speed before landing. If the vent was too big or too small, the parachute might swing wildly at this speed. The movement could damage the spacecraft.\nSo, to help decide how big the vent should be, Gordon put a parachute with a 1 m vent in a wind tunnel. The wind tunnel made it seem like the parachute was moving at 200 km per hour. He observed the parachute to see how much it swung.\nFigure: a spacecraft's parachute in a wind tunnel.image.pngclosed choicegrade8natural sciencescience-and-engineering-practicesEngineering practicesEvaluate tests of engineering-design solutionsPeople can use the engineering-design process to develop solutions to problems. One step in the process is testing if a potential solution meets the requirements of the design. How can you determine what a test can show? You need to figure out what was tested and what was measured.\nImagine an engineer needs to design a bridge for a windy location. She wants to make sure the bridge will not move too much in high wind. So, she builds a smaller prototype, or model, of a bridge. 
Then, she exposes the prototype to high winds and measures how much the bridge moves.\nFirst, identify what was tested. A test can examine one design, or it may compare multiple prototypes to each other. In the test described above, the engineer tested a prototype of a bridge in high wind.\nThen, identify what the test measured. One of the criteria for the bridge was that it not move too much in high winds. The test measured how much the prototype bridge moved.\nTests can show how well one or more designs meet the criteria. The test described above can show whether the bridge would move too much in high winds.test5
5What does the verbal irony in this text suggest?\nAccording to Mr. Herrera's kids, his snoring is as quiet as a jackhammer.[The snoring is loud., The snoring occurs in bursts.]0Noneclosed choicegrade8language sciencefigurative-languageLiterary devicesInterpret figures of speechFigures of speech are words or phrases that use language in a nonliteral or unusual way. They can make writing more expressive.\nVerbal irony involves saying one thing but implying something very different. People often use verbal irony when they are being sarcastic.\nOlivia seems thrilled that her car keeps breaking down.\nEach breakdown is as enjoyable as a punch to the face.The text uses verbal irony, which involves saying one thing but implying something very different.\nAs quiet as a jackhammer suggests that the snoring is loud. A jackhammer is not quiet, and neither is Mr. Herrera's snoring.val6
6Which animal's mouth is also adapted for bottom feeding?[discus, armored catfish]1Sturgeons eat invertebrates, plants, and small fish. They are bottom feeders. Bottom feeders find their food at the bottom of rivers, lakes, and the ocean.\nThe 's mouth is located on the underside of its head and points downward. Its mouth is adapted for bottom feeding.\nFigure: sturgeon.image.pngclosed choicegrade3natural sciencebiologyAdaptationsAnimal adaptations: beaks, mouths, and necksAn adaptation is an inherited trait that helps an organism survive or reproduce. Adaptations can include both body parts and behaviors.\nThe shape of an animal's mouth is one example of an adaptation. Animals' mouths can be adapted in different ways. For example, a large mouth with sharp teeth might help an animal tear through meat. A long, thin mouth might help an animal catch insects that live in holes. Animals that eat similar food often have similar mouths.Look at the picture of the sturgeon.\nThe sturgeon's mouth is located on the underside of its head and points downward. Its mouth is adapted for bottom feeding. The sturgeon uses its mouth to find food hidden in the sediment at the bottom of rivers, lakes, and the ocean.\nNow look at each animal. Figure out which animal has a similar adaptation.\nThe armored catfish's mouth is located on the underside of its head and points downward. Its mouth is adapted for bottom feeding.\nThe discus's mouth is not located on the underside of its head. Its mouth is not adapted for bottom feeding.val7
7Is this a sentence fragment?\nDuring the construction of Mount Rushmore, approximately eight hundred million pounds of rock from the mountain to create the monument.[no, yes]1Noneyes or nograde12language sciencewriting-strategiesSentences, fragments, and run-onsIdentify sentence fragmentsA sentence is a group of words that expresses a complete thought.\nThe band I'm in has been rehearsing daily because we have a concert in two weeks.\nA sentence fragment is a group of words that does not express a complete thought.\nRehearsing daily because we have a concert in two weeks.\nThis fragment is missing a subject. It doesn't tell who is rehearsing.\nThe band I'm in.\nThis fragment is missing a verb. It doesn't tell what the band I'm in is doing.\nBecause we have a concert in two weeks.\nThis fragment is missing an independent clause. It doesn't tell what happened because of the concert.This is a sentence fragment. It does not express a complete thought.\nDuring the construction of Mount Rushmore, approximately eight hundred million pounds of rock from the mountain to create the monument.\nHere is one way to fix the sentence fragment:\nDuring the construction of Mount Rushmore, approximately eight hundred million pounds of rock were removed from the mountain to create the monument.val8
8Which tense does the sentence use?\nMona will print her name with care.[present tense, future tense, past tense]1Noneclosed choicegrade2language scienceverbsVerb tenseIs the sentence in the past, present, or future tense?Present tense verbs tell you about something that is happening now.\nMost present-tense verbs are regular. They have no ending, or they end in -s or -es.\nTwo verbs are irregular in the present tense, to be and to have. You must remember their forms.\nPast tense verbs tell you about something that has already happened.\nMost past-tense verbs are regular. They end in -ed.\nSome verbs are irregular in the past tense. You must remember their past-tense forms.\nFuture tense verbs tell you about something that is going to happen.\nAll future-tense verbs use the word will.\nPresent | Past | Future\nwalk, walks | walked | will walk\ngo, goes | went | will goThe sentence is in future tense. You can tell because it uses will before the main verb, print. The verb tells you about something that is going to happen.train9
9Complete the sentence.\nSewing an apron is a ().[chemical change, physical change]1Noneclosed choicegrade4natural sciencechemistryPhysical and chemical changeIdentify physical and chemical changesChemical changes and physical changes are two common ways matter can change.\nIn a chemical change, the type of matter changes. The types of matter before and after a chemical change are always different.\nBurning a piece of paper is a chemical change. When paper gets hot enough, it reacts with oxygen in the air and burns. The paper and oxygen change into ash and smoke.\nIn a physical change, the type of matter stays the same. The types of matter before and after a physical change are always the same.\nCutting a piece of paper is a physical change. The cut pieces are still made of paper.\nA change of state is a type of physical change. For example, ice melting is a physical change. Ice and liquid water are made of the same type of matter: water.Sewing an apron is a physical change. The fabric and thread that make up the apron get a new shape, but the type of matter in each of them does not change.train10
question  choices  answer  hint  image  task  grade  subject  topic  category  skill  lecture  solution  split  pid
21198Which continent is highlighted?[Asia, Europe, Australia, North America]1image.pngclosed choicegrade5social sciencegeographyOceans and continentsIdentify oceans and continentsA continent is one of the major land masses on the earth. Most people say there are seven continents.This continent is Europe.test21199
21199What is the direction of this push?[away from the bulldozer, toward the bulldozer]0A bulldozer clears a path for a new road. A force from the bulldozer pushes loose dirt out of the way.image.pngclosed choicegrade4natural sciencephysicsForce and motionIdentify directions of forcesA force is a push or a pull that one object applies to another. Every force has a direction.\nThe direction of a push is away from the object that is pushing.\nThe direction of a pull is toward the object that is pulling.The bulldozer pushes the loose dirt. The direction of the push is away from the bulldozer.test21200
21200Which word does not rhyme?[tree, save, bee]1Noneclosed choicegrade1language sciencephonological-awarenessRhymingWhich word does not rhyme?Rhyming words are words that end with the same sound.\nThe words tip and slip rhyme. They both end with the ip sound.\nThe words lake and make rhyme. They both end with the ake sound.\nThe words tip and lake don't rhyme. They end with different sounds.The words tree and bee rhyme. They both end with the ee sound.\nThe word save does not rhyme. It ends with a different sound.train21201
21201Which sentence uses a metaphor?[Mr. Kent's legs were as long as sunflower stalks., Mr. Kent's long legs were sunflower stalks.]1Noneclosed choicegrade4language sciencefigurative-languageLiterary devicesIdentify similes and metaphorsSimiles and metaphors are figures of speech that compare two things that are not actually alike.\nA simile compares two things by saying that one is like the other. Similes often use the words like and as.\nMy sister runs like a cheetah.\nThe sister's running and a cheetah's running are compared using the word like.\nA cheetah is known for running fast, so the simile means that the sister also runs fast.\nThe cat's fur was as dark as the night.\nThe cat's fur and the night are compared using the word as.\nThe night is dark, so the simile means that the cat's fur is also dark.\nA metaphor compares two things by saying that one of them is the other. Unlike similes, metaphors don't use the word like or as.\nThe snow formed a blanket over the town.\nThe snow and a blanket are compared without the word like or as.\nA blanket is a large piece of cloth that completely covers a bed. The metaphor makes the reader imagine that the snow becomes a blanket, covering the town completely.\nUsing similes and metaphors in your writing can help you create an interesting picture for the reader.This sentence uses a metaphor:\nMr. Kent's long legs were sunflower stalks.\nThe words legs and sunflower stalks are compared without the word like or as.\nThis sentence uses a simile:\nMr. Kent's legs were as long as sunflower stalks.\nThe words legs and sunflower stalks are compared using the word as.train21202
21202Which country is highlighted?[Trinidad and Tobago, Haiti, the Dominican Republic, Dominica]2image.pngclosed choicegrade6social sciencegeographyThe Americas: geographyIdentify and select countries of the CaribbeanThis country is the Dominican Republic.\nWhy does the Dominican Republic share its island with another country?\nThe Dominican Republic and Haiti share the island of Hispaniola. It is home to the earliest European settlements in the Americas. Christopher Columbus founded the first European settlement on the island in 1492 during his first voyage across the Atlantic.\nThough many people lived on the island before Columbus's arrival, European countries quickly began to colonize the island. Eventually France and Spain both established colonies. The Spanish colony eventually became the country of the Dominican Republic, and the French colony eventually became the country of Haiti. Today, people in the two countries speak different languages and have many cultural differences.train21203
21203What information supports the conclusion that Tom acquired this trait?[Tom's scar was caused by an accident. He cut his leg when he fell off his skateboard., Tom's scar is on his left knee. His mother also has a scar on her left knee., Tom's brother has scars on both of his knees.]0Read the description of a trait.\nTom has a scar on his left knee.Noneclosed choicegrade7natural sciencebiologyGenes to traitsInherited and acquired traits: use evidence to support a statementOrganisms, including people, have both inherited and acquired traits. Inherited and acquired traits are gained in different ways.\nInherited traits are passed down from biological parents to their offspring through genes. Genes are pieces of hereditary material that contain the instructions that affect inherited traits. Offspring receive their genes, and therefore gain their inherited traits, from their biological parents. Inherited traits do not need to be learned.\nAcquired traits are gained during a person's life. Some acquired traits, such as riding a bicycle, are gained by learning. Other acquired traits, such as scars, are caused by the environment. Parents do not pass acquired traits down to their offspring.train21204
21204Which correctly shows the title of a movie?[Return to oz, Return to Oz]1Noneclosed choicegrade4language sciencecapitalizationFormattingCapitalizing titlesIn a title, capitalize the first word, the last word, and every important word in between.\nThe Wind in the Willows James and the Giant Peach\nThese words are not important in titles:\nArticles, a, an, the\nShort prepositions, such as at, by, for, in, of, on, to, up\nCoordinating conjunctions, such as and, but, orCapitalize the first word, the last word, and every important word in between. The word to is not important, so it should not be capitalized.\nThe correct title is Return to Oz.train21205
21205Which is a complete sentence?[Amy is from Greenwood now she lives in Wildgrove., This book explains the difference between cattle and buffalo.]1Noneclosed choicegrade3language sciencewriting-strategiesSentences, fragments, and run-onsIs it a complete sentence or a run-on?A sentence is a group of words that forms a complete thought. It has both a subject and a verb.\nMy friends walk along the path.\nA run-on sentence is made up of two sentences that are joined without end punctuation or with just a comma.\nI knocked on the door it opened.\nIt started raining, we ran inside.\nTo fix a run-on sentence, separate it into two sentences. Add end punctuation after the first sentence, and capitalize the second sentence.\nI knocked on the door. It opened.\nIt started raining. We ran inside.\nYou can also fix a run-on sentence by rewriting it as a compound sentence. A compound sentence is two sentences joined by a comma and a conjunction such as and, but, or, or so.\nI knocked on the door, and it opened.\nIt started raining, so we ran inside.This book explains the difference between cattle and buffalo is a complete sentence. The subject is this book, and the verb is explains.test21206
21206What information supports the conclusion that Rick inherited this trait?[Rick's coworker also has curly hair., Rick's biological father has curly hair., Rick and his biological parents have brown hair.]1Read the description of a trait.\nRick has curly hair.Noneclosed choicegrade7natural sciencebiologyGenes to traitsInherited and acquired traits: use evidence to support a statementOrganisms, including people, have both inherited and acquired traits. Inherited and acquired traits are gained in different ways.\nInherited traits are passed down from biological parents to their offspring through genes. Genes are pieces of hereditary material that contain the instructions that affect inherited traits. Offspring receive their genes, and therefore gain their inherited traits, from their biological parents. Inherited traits do not need to be learned.\nAcquired traits are gained during a person's life. Some acquired traits, such as riding a bicycle, are gained by learning. Other acquired traits, such as scars, are caused by the environment. Parents do not pass acquired traits down to their offspring.val21207
21207Which of these states is farthest east?[North Dakota, Washington, Pennsylvania, New Mexico]2image.pngclosed choicegrade3social sciencegeographyGeographyRead a map: cardinal directionsMaps have four cardinal directions, or main directions. Those directions are north, south, east, and west.\nA compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction.\nThe north arrow points to the North Pole. On most maps, north is at the top of the map.To find the answer, look at the compass rose. Look at which way the east arrow is pointing. Pennsylvania is farthest east.train21208